Computer and Modernization ›› 2012, Vol. 198 ›› Issue (2): 86-89. doi: 10.3969/j.issn.1006-2475.2012.02.023

• Database •

Study of Text Classification Method Based on Rough Set and Improved KNN Algorithm

SHAO Li   

  1. Teaching Affairs Office, Aba Teachers College, Wenchuan, 623000, China
  • Received: 2011-09-13 Revised: 1900-01-01 Online: 2012-02-24 Published: 2012-02-24

Abstract: The KNN algorithm is a common method in automatic text classification. It achieves high classification accuracy on texts represented by low-dimensional vectors. However, when applied to large numbers of high-dimensional texts, the traditional KNN algorithm must process a considerable number of training samples, which increases the cost of similarity calculation and reduces classification efficiency. To address these problems, this paper uses the rough set method to reduce the attributes of the decision table and remove redundant attributes, and then applies an improved cluster-based KNN algorithm to classify the texts. Simulation results show that the method improves the precision and accuracy of text classification.
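
The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, whose details the abstract does not give. Everything specific here is an assumption made for illustration: the greedy reduct search that preserves the rough set positive region, the leader-style clustering within each class, the similarity 1 / (1 + Euclidean distance), the function names, the threshold, and the toy decision table.

```python
import math
from collections import Counter, defaultdict


def positive_region_size(rows, labels, attrs):
    """Count rows whose equivalence class under `attrs` carries a single label."""
    group_labels = defaultdict(set)
    group_sizes = Counter()
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        group_labels[key].add(label)
        group_sizes[key] += 1
    return sum(group_sizes[k] for k, labs in group_labels.items() if len(labs) == 1)


def reduce_attributes(rows, labels):
    """Greedy reduct (assumed strategy): drop an attribute if the positive region is unchanged."""
    attrs = list(range(len(rows[0])))
    target = positive_region_size(rows, labels, attrs)
    for a in list(attrs):
        trial = [b for b in attrs if b != a]
        if trial and positive_region_size(rows, labels, trial) == target:
            attrs = trial
    return attrs


def similarity(u, v):
    """Assumed similarity measure: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.dist(u, v))


def build_centroids(vectors, labels, threshold=0.5):
    """Leader clustering inside each class; returns a list of (centroid, label) pairs."""
    clusters = []  # each entry: [sum_vector, count, label]
    for vec, label in zip(vectors, labels):
        for c in clusters:
            centroid = [s / c[1] for s in c[0]]
            if c[2] == label and similarity(vec, centroid) >= threshold:
                c[0] = [s + x for s, x in zip(c[0], vec)]
                c[1] += 1
                break
        else:
            # No sufficiently close cluster of the same class: start a new one.
            clusters.append([list(vec), 1, label])
    return [([s / n for s in total], label) for total, n, label in clusters]


def classify(vec, centroids, k=3):
    """Similarity-weighted vote among the k nearest cluster centroids."""
    nearest = sorted(centroids, key=lambda c: similarity(vec, c[0]), reverse=True)[:k]
    votes = Counter()
    for centroid, label in nearest:
        votes[label] += similarity(vec, centroid)
    return votes.most_common(1)[0][0]


# Hypothetical decision table: one row per document, binary term features.
# Columns (illustrative only): "the", "goal", "stock", "chip".
rows = [(1, 1, 0, 0), (1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)]
labels = ["sports", "sports", "finance", "tech"]

reduct = reduce_attributes(rows, labels)
projected = [[row[a] for a in reduct] for row in rows]
centroids = build_centroids(projected, labels)

new_doc = (1, 0, 1, 0)
predicted = classify([new_doc[a] for a in reduct], centroids)
print(reduct, len(centroids), predicted)
```

On this toy table the reduct keeps only two of the four columns, and the four training documents collapse to three centroids, so a query is compared against fewer and shorter vectors than in plain KNN. That is the efficiency argument of the abstract in miniature; the sketch says nothing about the accuracy figures reported in the paper.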

Key words: rough set, improved KNN algorithm, text classification method
